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Abstract—This text reviews the state of the art of the Dis- 
continuous Galerkin (DG) method applied to the solution of 
the Maxwell’s equations in Time Domain (TD). The work is 
divided into two parts. In the first part, the mathematical 
formulation of the DGTD method, together with a review and a 
discussion on the different ways to implement it is presented. 
The second part presents models and techniques to address 
usual needs in electromagnetic simulations such as plane wave 
illumination, local electromagnetic sources, wave port modeling, 
dispersive and/or anisotropic materials and sub-cell models, 
including lumped elements, thin layers, surface impedances, and 
thin wires. 


Index Terms—Discontinuous Galerkin Methods, Maxwell’s 
equations, Computational Electromagnetics 


I. INTRODUCTION 


In the last years Discontinuous Galerkin time-domain 
(DGTD) techniques have reached a significant level of ma- 
turity demonstrating their capability of obtaining highly accu- 
rate results at an affordable computational cost. They have 
successfully been used to solve many kinds of differential 
equations in fields including: Computational Fluid Dynamics 
[1], Magnetohydrodynamics [2], Quantum mechanics [3], and 
Elastodynamics [4], [5]. In this paper, we will review some 
of the existing DGTD techniques with special emphasis on 
applications to Computational Electromagnetics (CEM). 

The DG spatial discretization permits us to take advantage 
of using unstructured high-order finite elements. This allows 
an accurate discretization of the geometry using different sizes 
and types of cells (h-adaptivity), and also to obtain high-order 
convergence of the electromagnetic (EM) solution depending 
on the order of basis functions within each cell (p-adaptivity). 
The TD nature of this method, compared to its frequency- 
domain (FD) counterpart, offers benefits in several kinds of 
EM problems such as those where we need to study a transient 
field effect of an arbitrary time-signal excitation (e.g. lightning 
strikes, EMC coupling, ultra-wideband antennas), or the non- 
linear behavior of materials, components or networks, where 
TD offers a direct and efficient approach. To solve these 
problems, we have developed a solver: SEMBA (Simulador 
Electromagnetico de Banda Ancha) [6]. SEMBA implements 
many of the techniques that are reviewed in this text such 
as vector/nodal basis, centered/penalized/upwind fluxes, spe- 
cial materials, Local Time Stepping (LTS) techniques, hp- 
adaptivity, and OpenMP/MPI parallelization. These techniques 
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have been thoroughly tested in a wide range of problems, 
demonstrating to provide robust and efficient solutions [7]— 
[23]. 

However, the aim of this paper is not restricted to describing 
only those techniques directly tested by ourselves, but also 
to make a full review of the state-of-the-art including contri- 
butions found in the most recent literature. The rest of this 
work is divided into three parts: The first part (Sections II 
and III) provides a general review of the state-of-the-art. The 
second part (Sections IV and V) describes the mathematical 
foundation of the method, alternative introductory texts on this 
topic are [24]-[26]. Much of what is done in this part can be 
generalized to other partial differential equations (PDEs). The 
third part of this work (Sections VI, VI, VII, and IX) is 
focused on the application of this method to real engineer- 
ing problems. Several techniques to simulate electromagnetic 
sources, special materials (such as dispersive and anisotropic), 
and sub-cell models are introduced. Another text discussing 
some of these topics is [27]. 


II. OVERVIEW OF TIME-DOMAIN NUMERICAL METHODS 


Let us start by situating DGTD in the context of some of 
the most common TD full-wave numerical methods used in 
CEM. Their main typical features are summarized in table I. 


A. Finite Differences in Time Domain (FDTD) 


FDTD is a mature technique that has been extensively 
developed for over 50 years. The classical FDTD method 
[33] employs a second order finite centered approximation 
for space and time derivatives in Maxwell’s curl equations. 
This technique places the samples of the electric field in a 
rectilinear Cartesian grid while the magnetic field is sampled 
in the dual of this grid, resulting in what is known as the Yee’s 
cell [34]. The fields are then advanced in a marching-on-in- 
time fashion using a second order leap-frog (LF2) algorithm. 
The final scheme is second order convergent with respect to 
spatial and temporal refinement. 

The main advantages of the FDTD method are its com- 
putational efficiency, its naturally spurious-free solutions and 
the fact that it conserves energy. On the other hand, the need 
of a rectilinear grid and the staggering of the fields sampling 
imply a high degeneration of the geometrical information due 
to staircasing effects. However, the FDTD method can be used 
together with geometrically conformal [29] or subgridding 
[30] techniques that alleviate this limitation. Higher-order 
FDTD techniques can be formulated, but they require a larger 


TABLE I: Comparative summary of numerical methods with typical formulations 





FDTD FVTD DGTD FEMTD (others) 

Order of accuracy4 ? h° h h?P+I h2P 
Geometry adaptivity Nof Yes Yes Yes 
Spurious modes No Yes/No Yes°/Nof Yes8/No! 
Energy conservative? Yes Yes Yes°/Nof Yes 
Explicit form Yes Yes Yes No! 
LTS, IMEX or similar No Yes Yes No 
Parallel. simplicity High High High Low 
Memory usage! High Very High Low Very Low 
Memory locality Very High Low High“ High* 
Uses dual mesh Yes No No! No 
Allows non-conformal mesh No Yes Yes No 

h adaptivity Yes Yes Yes Yes 

p adaptivity No No Yes Yes 

















“For global L? norm. 

’Considering spatial semi-discretization only. 

‘Higher order spatial semi-discretizations are also available [28]. 
4Can be alleviated with conformal [29] and subgridding [30] techniques. 
“With centered fluxes. 

/With penalized fluxes. 

8For nodal basis 

"For vector basis. 

‘But can be approximated [31]. 

/For a structured mesh. Not considering semi-discretized operators. 
‘For high orders. 

‘Improvements using a dual mesh have been reported by [32]. 


stencil [28], [35], [36] that reduces significantly its computa- 
tional efficiency. The Finite Integration Technique (FIT) and 
the Transmission Line Method (TLM) are closely related to 
FDTD. FIT starts from Maxwell’s curl equations in integral 
form and TLM from equivalent transmission-line equations. 
The resulting algorithms share most of the features that we 
find in the FDTD method. 


B. Finite Volume Methods (FVTD) 


The FVTD technique emerged as an alternative to FDTD 
aiming to overcome its geometrical discretization constraints, 
avoiding the staggered spatial discretization of the fields. The 
most common formulation of FVTD is carried on tetrahedral 
elements for the solution of Maxwell’s curl equations [37]- 
[39]. The scheme is formulated by defining a system of equa- 
tions in which the time derivative of the E fields integrated 
in volume equals to the sum of all surface integrals of the 
spatial derivative of Ë terms and vice versa. The spatial semi- 
discretization is then evolved, similarly to the FDTD method, 
using an LF2 algorithm. The main drawback of FVTD is 
that its order of convergence is 1 [40], which is quite low. 
Moreover, the timestep is limited by a condition that depends 
on the shape of the elements which is more restrictive than 
for the FDTD method. A way to mitigate this time-stepping 
constraint is to use local time stepping (LTS) techniques [37]. 
LTS can also be used for DGTD as we will describe in Section 
V. FVTD can be formally seen as a zero-order DGTD method. 


C. Finite Element Methods (FEMTD) 


A variety of time-domain FEM schemes has been proposed 
[41] based on Maxwell’s curl-curl equation or the hyperbolic 
system of curl equations (Ampere’s and Faraday’s laws). The 


second-order vector-wave curl-curl equation, typically solved 
by FEM in FD, can also be solved by FEM in TD [42]-[51] 
requiring only the computation of a single field (electric or 
magnetic). Its major drawback is that a global linear system 
of equations needs to be solved at each timestep. To reduce the 
number of timesteps, unconditionally implicit time-integration 
schemes, such as Newmark-beta can be used, at the expense 
of yielding quite ill-conditioned matrices [52]. 

Alternatives to the single-field scheme are found by em- 
ploying the two first-order coupled Maxwell’s curl equations, 
either formulated by considering the electric field intensity E 
and the magnetic flux density B (E-B), or the electric field 
intensity E and magnetic field intensity A (E-H). These for- 
mulations offer certain advantages with respect to the single- 
field formulation, such as the possibility of using different ex- 
pansion functions and avoiding spurious solutions. Moreover, 
the first-order time derivatives allow the use of a conventional 
LF2 time-integration method, eliminating the need of saving 
previous states in memory. However, because of the tangential- 
continuity condition, they still require solving a sparse linear 
system at each timestep, resulting in a computational cost 
comparable to that of the single-field scheme [53]-[58]. 


D. Discontinuous Galerkin Methods 


A different family of FEM is found by relaxing the 
tangential-continuity condition, yielding the so-called discon- 
tinuous Galerkin methods (DGM). The continuity is imposed 
on numerical fluxes rather than on the tangential field com- 
ponents, in order to connect the solution between adjacent 
elements. The main advantage of DGM over other FEM 
methods in TD is the fact that the linear system to be solved 
becomes block-diagonal by requiring only a single inversion 
of K square matrices of N x N elements (with K the 


number of elements and N the number of basis functions per 
element) which can be done at the pre-processing stage. One 
of the drawbacks is that the degrees of freedom (DOF) at the 
element interfaces are duplicated, a minor price considering 
the improvement in computational efficiency of the resulting 
explicit semi-discrete scheme [10], [19], [24], [59]. 


III. APPLICATIONS OF DGTD 


The spectrum of engineering applications of CEM is ex- 
tremely wide, giving rise to a need of different numerical 
methods as there is not an ever-suitable method capable 
of solving all types of real-world EM problems [60]. The 
following is a non-exhaustive list of some areas where DGTD 
methods can be of particular interest: 


A. Multi-scale problems 


Any problem exhibiting disparate sizes, such as IC packag- 
ing or in-place antenna simulations, can greatly benefit from 
a DGTD approach [20], [22], [26], [61], [62] combined with 
LTS and hp adaptivity techniques. 


B. Electromagnetic Compatibility (EMC) 


EMC problems are an increasingly big concern for the in- 
dustry. Aircraft and car manufacturers perform CEM analysis 
to detect and solve possible EMC issues in eager stages of 
design. Simulations are usually performed to back measured 
data. However, sometimes measurements are difficult to per- 
form, as in the case of lightning strikes or High-Intensity 
Radiated Fields (HIRF), and manufacturers have no other 
option than to rely exclusively on simulations. The degree of 
confidence put into these simulations is such that they have 
been allowed as valid tests for certification purposes [15], 
[63]. To perform EMC simulations, a DGTD code often must 
include models for sub-cell thin wires and composite layers 
(Section VIII) [14], [64], [65]. Moreover, these simulations 
are usually performed over electrically large problems and a 
high-performance and accuracy simulation is a requirement. 


C. Antennas 


An essential characteristic for the accurate simulation of 
wideband antenna systems is the modeling of their intricate 
geometrical details [12], [22]. In these kind of structures, 
an accurate modeling is critical in zones with small geo- 
metrical details, such as feeding ports. Frequency domain 
(FD) methods, such as the Method of Moments (MoM) or 
the Finite Element Method (FEM), are the usual choices 
for their capability of accurately modeling fine geometrical 
details. However, FD methods may become computationally 
inefficient for ultra-wideband analysis, since each frequency 
needs a complete simulation, typically involving a linear sys- 
tem resolution. Time-domain methods are a natural alternative 
for these purposes. Among them, DGTD methods are ideally 
suited for this purpose. LTS techniques, which are reviewed in 
Section V allow us to handle antenna geometries efficiently. 
Some techniques to model ports are described in section VI-C. 


D. Waveguides 


Like antennas, the simulation of waveguides usually needs 
the modeling of intricate geometries where DGTD offers 
an efficient solution. TD simulations allow us to estimate 
the resonant frequencies of these structures in a single run. 
Waveguides are usually resonant structures where the absence 
of spurious modes is a must [10], a discussion on how to 
keep spurious modes under control is carried out in Section 
IV-E. Moreover, the electromagnetic waves, often imping at 
grazing angles of incidence at the terminations of the waveg- 
uide. This makes necessary the use of an special treatment 
at the terminations, such as the use of Perfectly Matched 
Layers (PMLs) described in Section VII-A, or the multi-modal 
pseudo-analytical termination presented in [66]. 


E. Radar Cross Section 


The analysis of the Radar Cross Section (RCS) of aircrafts 
can also be carried out with DGTD methods in an efficient 
manner [18], [20]. In this case, the TD nature allows us 
to efficiently perform monostatic RCS in a single run. The 
LTS technique, PMLs, Huygens sources (Section VI-A), and 
the ability to model complex geometries enables this task. A 
comparison with other numerical techniques is presented in 
[18] where, for a broadband solution, the DGTD method is 
demonstrated to be competitive with other methods classically 
used for this task, such as the MoM. 


F. Ground Penetrating Radar (GPR) 


GPR techniques can benefit from simulations when new 
antennas are being designed or complex geometries are under 
study. Simulations can be helpful in understanding in-the- 
field obtained data. Moreover, ground materials are usually 
dispersive and non-homogeneous. FEM and DGTD can model 
materials with gradually changing electrical properties in cells. 
Dispersive and lossy ground media can also be included [9], 
[67]. 


IV. THE DISCONTINUOUS GALERKIN METHOD 


Maxwell’s curl equations for lossless isotropic linear media 
without sources are 


oH = 
OE = 
EAE = VxH (2) 


with € and p being, respectively, the electric permittivity 
and the magnetic permeability that we will assume to be 
homogeneous and isotropic for simplicity. Dispersive media 
are treated in Section VII-B. 


A. The Galerkin method 


Let us call Q the region where we want to solve equations 
(1) and (2) applying a DG formalism. This region Q is 


tessellated with K non-overlapping elements V;, fully covering 
the computational domain Nx 


K 


x9r =| ]v (3) 
k 


For simplicity, we will suppress the subscripts k everywhere 
except in V;, uniquely identifying the element where we 
are working. We assume that they can be inferred from the 
element where the Galerkin integrals are carried on. Within 
each element, the fields E(t,*) and H(t,7) are approximated 
by a projection over a set of N vector-basis functions 


= [BE dM, ... dx} (4) 


The Galerkin problem consists of minimizing the inner 
product of the fields, projected over B with respect to each 
of the functions basis (4) within each element V;, leading to 
formulate eqs. (1) and (2) in weak form as 





Ja pitva dV =0 (5) 
vi at 
[be |GR-wxa dV =0 (6) 
i at 





with: i = 1,... N 


Let us now explicitly write the approximation E and H as 
the projection over the same basis, B. Thus 


api Szo OF) = PE (7) 
ha OPOR =4"H (8) 
j 
with 
= E ET 
y= [n-En] (9) 
E = [E},..., ER] (10) 
H = |H}... , H} (11) 


Inserting (7) and (8) into (5) and (6) we obtain the Galerkin 
semi-discretization 


= H 
fe [wor Zs (vx BE] v=o (12) 
[ & [ew S- (vx BH] a V=0 (13) 
with: i = 1,... N 


The first terms of eq. (12) and (13) serve to introduce the 
mass matrix, M, 


KE Í EDOR (14) 


The curl terms of eq. (12) and (13) result in the stiffness 

matrix, S, 
Sly = | Wi) V x plr) dV (15) 

Vk 

In the form stated in (12) and (13), we are not specifying 
how the tangential components of the fields within each 
element relate to each other. If we enforce the fields, Eh 
and H” to be globally continuous, this technique is called 
the Continuous Galerkin Time Domain (CGTD) or FEMTD 
method. The approximated fields then fulfill in a strong way 


fp x Ef = fig x Bet (16) 
ne x HY = ny x ae (17) 


where the superscript + indicates the field neighboring the 
element across face f. The main drawback of the resulting 
algorithm is that it requires the solution of a large set of linear 
equations. 


B. Numerical flux 


The DG method [68] relies on enforcing the continuity 
of the numerical flux across face f rather than the field 
components as in (16). Using basic vector identities, the curl 
terms in (12) (and similarly in (13)) can also be expressed as 


wil?) -(V x Ef) dV = 
Vk 


= V-(E} x diy avs f (vx di) E} av (18) 
Vk Vk 

= Bi: (x EP) aov) + f (V x pi) E? aV 
Vk Vk 


where ñ is a unit vector pointing outwards the element. The 
first term of the RHS of Eq.18 (A x Er) is substituted by the 


flux function across face f (à x Be) 

Therefore, instead of plugging (16) into (18) and then into 
(12) and (13), we define numerical values of the tangential 
fields on ƏV, f,k)» henceforth called numerical fluxes, E F and 
Äe *, which do not need to match any of the values of the 
tangential fields on any side of OV; but will depend on them. 


fsx BP" on E (DLE R A) 





fag x HY = ay x A ERAN EREN (19) 


An interesting feature of DG methods is that we have several 
possibilities for choosing the numerical flux as long as they 
satisfy the following conditions [69]: 


e Consistency: Ep (Ep Eh, Hh , H) = Bi 


e Continuity: EP is at ies fe continuous. 


ye 


e Monotonicity: ‘ph is a non-decreasing function of E} 
and H? ş and a non- baedi function of Ept and H ae 


The properties of the scheme will greatly depend on the choice 
of the flux [59]. We will focus on the three most common 





Fig. 1: Notation used for the definition of the numerical fluxes. 


TABLE II: Parameters in Eq.(20) to yield centered, upwind, 
and partially penalized numerical fluxes. With Y and Z stand- 














addition to the centered flux of dissipation terms that can be 
tuned to attenuate the spurious modes, and also to improve 
the accuracy. Other fluxes are described in detail in [59], 
such as the Stabilized Upwind flux. Although these fluxes 
have interesting properties, they require the introduction of 
a new DOF that would need to be evolved in time; and 
therefore, increase the computational cost. For this reason, it 
is not common to find them in the DGTD application-oriented 
literature. 


C. Semi-discretized form 


The introduction of (20) in Eq. (18), together with (14) and 
(15) let us write (12) and (13) in the final DG semi-discretized 
form 
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choices: the centered flux, the upwind flux, and the partially 
penalized flux. A general form for all of them is 


fly xX a = 
= fig x (Bh + sell" + vn às x E*N) 
ûs x a = 
= ñy x (F? + sullA"]y -ve ûs x EN) 20) 
with 
[E]; = Ept — BF 
[A]; = Hp* — AP (21) 


Table II shows the expressions for the « and v factors 
for centered, upwind, and partially-penalized numerical fluxes. 
The terms which are multiplied by v factors are known as 
penalization or upwind terms, and come from the solution of 
the Riemann problem [70]. In the case of partially-penalized 
fluxes, those are multiplied by the 7 parameter. These terms 
introduce some dissipation to the scheme [71]-[73], but are 
essential to avoid the propagation of non-physical or spurious 
modes in the computational domain [10], [59] as will be 
shown in Section IV-E, where dissipation rates are numerically 
evaluated in the eigenvalue problem. When v = 0 (centered 
flux), there is no dissipation for either physical or spurious 
modes, at the cost of introducing spectral pollution to the 
method. In between the upwind and centered fluxes, a family 
of partially penalized fluxes can be defined [74], through the 


Ny 
- ab (aig x fie x [[E"]]¢) d(OV) eD 
Dee fun (As k ‘) 


where we have followed a similar procedure for (13). On 
section IV-F2 we will present some particular cases of these 
expressions, depending on the choice of the basis. 


D. Boundary conditions 


The flux conditions which serve to connect adjacent el- 
ements, also serve to directly implement basic boundary 
conditions in a weak form, by simply modifying the jumps 
in (21). 

1) Perfectly Electric Conducting (PEC): The PEC condi- 
tion requires the tangential component of the electric field to 
be null and the tangential magnetic field component to be 
continuous, thus 


fp x ([E" pec = —2 frp x EP 


fp x ([H" ]lpec = 0 (23) 


2) Perfectly Magnetic Conducting (PMC): The PMC con- 
dition is the reciprocal of the PEC one, 


up x ([E"Jlpmc = 0 


fp x ([A]]pmc = —2 ftp x AP (24) 


3) Silver-Mueller Absorbing (SMA): The first-order SMA 
boundary condition [38], [75] is based on considering that the 
fields outside the computational domain _Propagate as plane 
waves normal to the interface, ñx E = ZH or hx H = -Y E. 
To apply this condition to (20) is equivalent to modifying the 
jump terms to 


ñg x [[E"]sma = -ôr x E" 
ûs x [[E"]sma = -ôr x H” (25) 
and using the following constants in (20) 
1 
KE,SMA = 2 
1 
KHSMA = 5 
1 
v = — 
H,SMA = zy 
1 
= = 2 
VE,SMA = 57 (26) 


The SMA boundary condition provides an ideally null reflec- 
tion coefficient for normal incidence. In practice, its perfor- 
mance is reduced by the numerical accuracy of the method 
with its absorbing characteristics rapidly degrading with the 
angle of incidence with respect to surface normal [76]. For 
this reason, it is is usually preferred to use Perfectly Matched 
Layers (PML) to truncate the computational domain. This will 
be described in section VII-A. 


E. Convergence and spurious modes 


Defining a state vector q = [E H]T containing all DOF 
within element k, we can rewrite equations in (22) as a single 
equation that governs the time evolution of the system 


q(t) = 


F4 (Eralt) - Ef af) 
(27) 


where S41 groups the stiffness operators, and FH groups the 
flux operators acting over face f. The operator € f has the 
function of reordering the terms in q to reproduce the algebraic 
system (22). We have also assumed that we use a system of 
units in which £ = u = 1 to simplify the discussion. To further 
simplify this analysis, we will change the basis of the vector 
space in equation (27) by using an invertible operator P in 
(27) that diagonalizes the locally applied operators 


N 
W =-P71(M4)7* | St SS FIE, P (28) 
f=1 
We can also define the eigenmodes as 
p=P'q (29) 
and the external operators as 
Vp = -P (MI) 1 FREFP (30) 


This change of basis lets write equation (27) in the following 
compact form 


Orp(t) (31) 


0+5 v0 


Eq. (31) shows that the system can be stated as a system of 
2N independent first-order ODEs with eigenfrequencies given 
by the eigenvalues of W. This system contains contributions 
coming from the fluxes through the V +P} terms. This result is 
particularly useful in studying the convergence, stability and 
other spectral properties of the scheme. Moreover, as it will be 
discussed in section V, the spectral properties of the scheme 
will have an important role on the maximum time step required 
for stability. 

1) Convergence: The dispersion and dissipation of this 
method can be studied by comparing the computed and 
analytical plane-wave solutions within a computational do- 
main with periodic boundary conditions [12], [19], [24], [72], 
[77]. Therefore, for an initial solution q with wavenumber 
k we obtain eigenmodes, pj as the projection of the initial 
solution on the diagonalized space. Each of these modes 
has a numerical frequency w,;(k) € C corresponding to the 
eigenvalues of W. The imaginary part S[w,;] corresponds to 
the oscillating frequency and the real part R[w,] corresponds 
to the numerical dissipation or amplification of eigenmode pj, 
if any. A necessary condition for convergency is R[w,;] < 0, 
which is always fulfilled by Galerkin methods. The numerical 
phase-velocities supported by the scheme are c;(k) = w,/k. 
Among them, we will label as cy) the one with the phase- 
velocity closest to the analytical one. Therefore, we have, 


1 
Jeu 


or equivalently, that the numerical solution tends to the an- 
alytical one for higher resolutions. As mentioned earlier, the 
numerical flux will impact the dispersion and dissipation of the 
numerical scheme. Table II summarizes the expected disper- 
sive and dissipative convergences for the different numerical 
fluxes [14], [72]. 

2) Spurious modes: The solution provided by a discrete 
approximation must also be spectrally correct. That is, we 
may obtain a low error when the exactness of the solution 
is measured with global parameters in TD, but observe a pol- 
luted spectrum exhibiting non-physical resonances or spurious 
modes in FD. Therefore, we will require certain features from 
the numerical spectrum [24], [78] such as: 


lim cas(k) = (32) 


e Non pollution and completeness of the spectrum of 
eigenvalues and eigenvectors for a suitable resolution. 
e Isolation of the discrete kernel modes. 


A well-known drawback of nodal FEM is the presence of 
spurious modes [79]. For instance, in [24] it is shown that a 
grid sufficiently far from being quasi-structured together with 
a centered flux will make spurious modes arise at relatively 
low frequencies. These are commonly attributed to a variety of 
reasons, including an inexact representation of the underlying 


de Rham complex!. 

An added advantage of DGTD over FEMTD resides in 
its discontinuous nature that permits spurious solutions to be 
removed if we use upwind or penalized fluxes [71], [72], [86], 
[87]. These fluxes are characterized by the addition of dissipa- 
tive terms to Maxwell’s equations, and are proven to attenuate 
spurious modes more strongly than physical modes. The effect 
in the spectrum of the eigenvalues in the use of different fluxes 
can be appreciated in Fig. 3. The use of centered fluxes results 
in all eigenvalues lying on the imaginary axis whether if they 
are physical or not. Fig. 2 shows how this issue translates 
into the spectrum of resonances of a PEC cavity, making the 
identification of physical resonances difficult. 
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Fig. 2: Power spectrum of the electric field at an arbitrary point 
inside a 1 m PEC cavity. The effect of the non-attenuation of 
the centered flux spectrum can be appreciated compared with 
the upwind and penalized fluxes. 


F. Vector/Nodal basis functions 


Let us discuss two families of basis functions here, the 
vector and the nodal basis. 

1) Vector basis functions: Many authors use vector basis 
for the implementation of DG schemes [12], [64], [88]-[90]. 
Vector-curl conforming basis [91]—[94] were first proposed to 
solve spurious-mode problems appearing in solutions using 
scalar basis in FEM [81]. However, these conclusions cannot 
be straightforwardly extrapolated to the DG case and, as has 
been discussed in section IV-C, the spurious-modes issue has 
been solved using penalized fluxes. 

We can define flux matrices based on vector basis functions 
(22) as 


pS = SB — np( FEE F.E) - vn (F) H+- F,H) 
Me = SH+kx#(F{H*- F,H) —ve(F{E*- FE) 


(33) 


'One way of removing this source of spurious modes is to resort to vector- 
based formulations [80]-[82]. Comparing vector and nodal FEM is out of the 
scope of this work; advantages and disadvantages of both have been reported 
in literature [83], [84] and would deserve a full work to be further analyzed. 
Another approach to mitigate spurious modes is by introducing penalty terms 
associated with the divergence of E [59], [85], at the cost of adding extra 
terms, and DOFs, that are to be evolved at each timestep [59]. 
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From an implementation point of view, the main advantage 
of the curl-conforming vector basis functions is that S, F,, 
and FT can be shared by all the elements in a problem, since 
they do not depend on the geometry of the cells (size, aspect 
ratio and curvature) [12]. However, the mass matrices are full, 
with a size of N x N. 

2) Nodal basis functions: With nodal basis functions some 
simplifications are possible. The set of nodal basis functions, 
Bhn, can be seen as a particular case of (4) in which 


„ln ĝ, h2,- ln} (88) 


with N = 3N,, and N, as the number of different scalar 
functions, J; that can be obtained from a simplex element 
formulation [79]. We may be tempted to put the nodes r, 
equidistantly for the sake of simplicity. However, as discussed 
in [24] and [95], the Lagrange basis with nodes located at the 
Legendre-Gauss-Lobatto (LGL) quadrature points is a better 
choice, obtaining low condition numbers for the local matrices 
even at high p orders. 
Using the nodal basis (38), we have that 
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and similarly, for A = Hl. When expressed using the nodal 
basis (38), the operators M and S are composed of blocks that 
decouple some Cartesian components of the vectors [83], 


Mn 
M= Mn (40) 
Mn 
-S? Sy 
S=| & —S? (41) 
-Sh Sa 


Where M,, and S?7¥* have size N, X Np. The flux terms 
are also simplified, as they will now need only to account for 
fields in nodes at face f. When a nodal basis is used, the 
equation (22) can be expressed for non-curved elements, as 
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(a) Centered flux 
(7 = 0.0) 





(b) Upwind flux 
(7 = 1.0) 
































(c) Penalized flux 
(7 = 0.1) 


Fig. 3: Normalized spectrum of the DG operator for a cubic domain (meshed with 24 tetrahedrons) with periodic boundary 
conditions. We can see how the centered flux does not provide an isolated kernel, contrary to the upwind and penalized fluxes. 


TABLE II: Comparison of vector and nodal basis 





Vector Scalar! 
Linear Curved Linear Curved 
M size NxN Np x Np 
S size NxN 3(Np x Np) 
F size 4 sparse matrices | Np x (Nf Nsp) Np x (N¢ Nop)? 
Shared operators® S, Fey Fit M, F None 











“The number of nodes for scalar basis is Np = N/3. The number of face nodes Npp is (p + 1)(p + 2)/2! for tetrahedrons and (p + 1)? for hexahedrons. 
For curved cells, the storage and number of operations needed are significantly higher, varying depending on the implementation. 
‘Without considering identical cells in which all operators can be shared. Neglecting scaling factors. 
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(44) 
has a size of Np x Nfp with Nfp being the number of nodes 
at face f. This implies that nodal basis scale computationally 
better than vector basis when we increase the order of basis 
functions p; being this the main reason why nodal basis is 
usually preferred for high order schemes [24]. Fn and Mn are 
the same for all elements except for a scaling factor; therefore 


we do not need to store them more than once for the entire 
simulation. Note that to to obtain the equations (42) we assume 
that p is constant for the flux integral terms in (22); therefore 
this simplification in the flux integral is not valid if we work 
with curved elements (see Section IV-G). 


G. Curved cells 


One of the most appealing features of DG methods is that 
they can be formulated for higher-order geometric elements 
which offer a better geometrical adaptivity [24], [96]-[99]. 
Most available open-source [100] and commercial meshers 
[101] offer the possibility of meshing with quadratic elements 
and techniques-exist allowing higher orders [102]. Using 
quadratic elements allows us to use fewer of them to accurately 
discretize a curved surface, thus implying that their size can 
be larger. 

The implementation of this technique requires the usage 
of quadrature integrals [103] because the complexity of the 
involved Jacobians needed to transform the reference element 
into the actual mesh element which results in integrals that 
cannot be solved analytically. For nodal basis, this technique 
needs to store information of the operators needed by the 
curved element, thus requires storing one flux matrix (44) 
per cubature point [24] or alternatively, one operator per 
each term in (20) containing a normal unit vector. These 
requirements introduce a significant computational overhead 





Linear 
Mapping 





Fig. 4: Mapping from the reference element for linear (first 
geometrical order) and quadratic (second geometrical order) 
tetrahedrons. 


both in memory, as well as a number of operatins that can be 
a factor, depending on the application. 

To illustrate the possibilities of this approach, Fig. 5 shows a 
comparison of the results obtained with meshes using the same 
number of quadratic and linear elements. It can be appreciated 
that the improved geometry adaptivity provides a better result 
for the same number of elements. 
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Fig. 5: RCS at 450 MHz of a 1m radius PEC sphere meshed 
with the same number of linear and quadratic tetrahedrons and 
a spatial basis of order p = 3. Results obtained with GEG- 
UGR SEMBA software (www.ugrfdtd.es). 


H. Non-conformal and hybrid meshes 


Another advantage of using numerical fluxes to exchange 
information between elements is the possibility of using non- 
conformal interfaces between elements [104]-[107]. This is to 
connect elements that do not share a whole face but a portion 
of it. This feature is interesting because there are applications 
with very intricate geometries where a conformal mesh may 
be very difficult to obtain, or domains that require different 


element sizes. In DG, thanks to the flux functions, the interface 
between non-conformal meshes can be posed in a natural way. 
In [104], [106], [107] the authors find that the convergence of 
the method remains the same as with conformal meshes. 
The use of non-conformal meshes allows us to interface 
tetrahedrons with hexahedrons in the transition regions, as 
demonstrated in [106], [107]; and therefore, use hybrid meshes 
that combine several kinds of elements. Another possibility 
for the transition region is to use pyramidal elements. This 
approach has been studied in [108], [109]. In the case of 
non-conformal meshes, this can be done in a natural way 
by making use of the flux terms. The advantage of using 
hybrid meshes is that we can have the best of two worlds 
[4], [64], [88], e.g. tetrahedrons can adapt better to surfaces in 
complicated geometries, or hexahedrons for the discretization 
of zones where structured mesh could be used, enabling larger 
time steps and a reduced number of degrees of freedom. 


V. TIME INTEGRATION 


In this section, we present two time integration methods that 
are also the most popular choices, in conjunction with the DG 
semidiscretizations presented in the previous section. Table IV 
presents a summary of features of different time integration 
methods. 


A. Leapfrog time integration 


1) Second order leapfrog (LF2): The second-order leap- 
frog method [120] is applied by alternately evolving the E” 
and H”+!/2 fields, arbitrarily defined at times t, and t„ + 
At/2 respectively. This implies that we do not have a fully 
defined state vector in the sense of eq. (27) for a given time t. 
To obtain the future values from a present state, the following 
algorithm is applied 


EX} = E” + At Lh (H°+/2, E") 


Hr+3/2 — H +H/2 4 At ch, (ee pe) (45) 
with L% and L} being the equations (22), respectively. When 
centered fluxes are used, the L% and £L}, operators use only 
H”?+!/2 and E”+! as arguments, respectively. This implies 
that the scheme is reversible in time and will preserve energy 
as long as the timestep used is below a maximum value h; set 
by a CFL-like condition [90], [118], [120]. The use of upwind 
or penalized fluxes would imply the need of averaging between 
the next and previous semi-integer time steps in the dissipation 
terms, thus resulting on a globally implicit scheme. To avoid 
this we need to use a backwards approximation [12], [61] and 
use the last previous known value instead of averaging. 

2) Convergence and spectral properties of the LF2 scheme: 
The study of the full spectrum of W obtained in (28) is also 
useful as its properties impose limitations in regards to the 
time-integration. The LF2 method has the following stability 


requirement on its timestep h, [12], [74], [121] 
he < 1/S weg] (46) 


and therefore is constrained by the largest imaginary part 
among all eigenvalues of (28). Equation (46) needs to solve 
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Fig. 6: Dissipation factors (left) and normalized phase velocities (right) for the LSERK4 (up) and LF2 (down) schemes. All 
the semi-discrete eigenvalues calculated depicted in Fig. 3 must lie within the regions delimited by the thick red line to ensure 
the stability of the fully discrete scheme. LF2 supports two modes arising from two solutions for the growing factor. Only the 
positive one is represented. Please note that the final form of the dissipation and phase velocities depend on the combination 


of the two modes [110].) 


a complex eigenvalue problem for each specific problem. To 
avoid this, we can use heuristic or analytical closed conditions 
[74], [87], [122]. For instance, for p = 1 with centered flux 
we have that 

12 Vk 


Mo = e 
X 8+ v40 ckôVk 

Therefore, the LF2 algorithm has a semi-infinite stability 
region which may not be bounded with respect to a value 
of the real axis, depending on how the method is initialized 
[110]. This relieves this method from some bounds in material 
properties, and additional algebraic constraints that are present 
for methods with closed-stability regions [123]. 

3) Higher-order LF: A N-th order leapfrog (LFN) time in- 
tegrator applied to DG is discussed in [104]. These techniques 
can obtain high-order convergence in time avoiding the use of 
larger stencils. Thus, these methods allow us to retain the high- 
order convergence in the fully discretized numerical scheme. 
They also allow larger time steps than for LF2, while retaining 
their symplectic non-dissipative nature. However, they require 


(47) 


N/2 times more memory storage and N — 1 times more 
arithmetic operations per time step than LF2. This is the main 
reason why their usage is not widespread when higher-order 
convergences are demanded, where the LSERK is the most 
typical approach. 


B. Low Storage Explicit Runge Kutta time integration 


1) The LSERK4 scheme: The low-storage five-stage fourth- 
order Explicit Runge-Kutta method (LSERK4) [24], [111], 
[124] allows us to achieve a fourth-order convergence in the 
time integration, storing only one additional unknown per 
degree of freedom. For a given vector representing the state 
of an element k, i.e. p(t) = p} we can find an approximate 
solution state p.(t + At) = pit! applying the following 


TABLE IV: Comparison of different time integration methods 








LF2 LSERK4 Tailored LSERK(m,n)* SSP-RK(m)’ STDG 
Order of convergence h2 h4 h? he he Pett 
Explicit form Yes Yes Yes Yes Pseudo 
Dissipative No Yes Yes Yes Yes/No 
Stages stored 14 or 1.5¢ 2 2 m 2 or pe +1 
Stability region Semi-infinite Closed Closed’ Closed’ Semi-infinite 


LTS 
IMEX 





Yes [20], [74] 
Usually CN2 [62], [118] 





Yes [20], [116] 
Implicit RK [119] 





Not demonstrated 
Not demonstrated 








Not demonstrated 
Not demonstrated 





Pseudo [117] 
Pseudo 
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“LSERK with m stages and order n; as described in [111], [112]. 
>With m stages; as described in [71], [113]. 

“Non dissipative implementations [114], [115]. 

4With centered fluxes. 

‘With upwind fluxes. 


‘And customized to fit the eigenvalues of the DG operators [111], [112], therefore allowing larger hy than LSERK4. 


8Larger than with LSERK4, allowing larger he. 


algorithm, using the notation introduced in (31) 


pO =p", 


r® = ar) + At Wp) AE yp Yt , 


f 
p = p&-Y + dr, 
pt» = p® (48) 
with i € [1,...,5] and the coefficients a;, b; and c; taking 


the values indicated in Table V and r being the residue. The 
LSERK4 scheme is one of the most used methods in high— 
order Discontinuous Galerkin semi—discretizations because of 
its low dispersion and dissipation errors. Contrary to other 
RK implementations, the low-storage version requires the 
storage of only twice the number of degrees of freedom in 
the scheme at the expense of one additional stage. Despite its 
many advantages, LSERK4 has a higher computational cost 
than LF2, and the numerical dissipation it introduces can be a 
factor depending on the application. For this reason, a number 
of authors have proposed alternatives for the classical LSERK4 
scheme [71], [111], [112]. 


TABLE V: Coefficients for the low-storage five-stage fourth- 
order Explicit Runge-Kutta method (LSERK4) 




















s as bs Cs 
1 0 1432997174477 0 
9575080441755 

2 _ 567301805773 5161836677717 13612068292357 
1357537059087 13612068292357 9575080441755 

3 _ 2404267990393 1720146321549 22526269341429 
2016746695238 2090206949498 6820363962896 

4 _ 3550918686646 3134564353537 2006345519317 
2091501179385 4481467310338 3224310063776 

5 _ 1275806237668 2277821191437 28032321613138 
842570457699 14882151754819 2924317926251 





2) Convergence and spectral properties: RK methods are 
constrained by the spectra of the operator Wj; i.e. all the 
eigenvalues of WV; must lie inside the stability region of the 
RK scheme (Fig. 6). The LSERK4 method allows for slightly 
larger time steps than LF2 but imposes constraints when deal- 
ing with dispersive materials [123]. LSERK methods comprise 


irregular closed loci in the complex plane [111], [125] in 
which the eigenvalues of (28) must lie to ensure stability. 
Consequently, the timestep must be chosen sufficiently small, 
e.g. for a nodal basis, the following inequality must hold [24] 


Ahi 





Gs 
Ahı k < — min 
, Ged 


(49) 


where min; Ah;,; indicates the minimum distance between 
nodes in element k, c is the maximum speed of light in the 
element k, and C is a constant. 


C. Achieving larger global timesteps 


The presence of small-sized elements imposes constraints 
to the maximum size of the time step, severely affecting the 
computational efficiency. As we can appreciate from (46) and 
(49), the maximum time step allowed to ensure stability is 
proportional to an inverse power of the size of the elements. 
This has been a topic of intense research aiming to overcome 
the global limitations imposed by a local condition. Some 
existing solutions are discussed below. 

1) Local Time Stepping (LTS): The most straightforward 
approach to deal with the global timestep restrictions imposed 
by the presence of local small-sized elements is to devise a 
LTS technique by which these elements are evolved using 
a smaller timestep. To do so, the elements are clustered in 
different groups, or tiers, according to the maximum timestep 
allowed by the smallest element in the tier. The interfaces 
between elements within the same tier are treated in the 
usual way; however, the interfaces between tiers will require a 
special treatment because the smaller tiers require field values 
that the larger steps do not compute. 

For the LF2 method, we can find at least three alternatives: 


e In [74], [120] and [90], the authors use a method, 
initially devised by Montseny, in which the last available 
field coefficient is used when the smaller tiers require 
intermediate time-step values from larger tiers. The main 
advantage of this technique is that it preserves the re- 
versibility of the scheme and in consequence, the scheme 
remains non-dissipative. On the other hand, it introduces 
some additional numerical dispersion, and a penalization 
of the stability condition. 


e In [12] the LTS is accomplished interpolating the field un- 
knowns in an interface region between the different tiers. 
This interpolation improves the accuracy and stability of 
the technique compared with Montseny’s method. 

e In [20] a technique called causal-path LTS (CPLTS) is 
applied. This technique consists of computing auxiliary 
fields in a shrinking buffer zone whenever they are needed 
by the smaller time-step tier. Once they have been used by 
the smaller tier, they are casted away and the higher tier 
is evolved using the original values. As shown in [20], the 
scheme has better dispersive properties than Montseny’s 
and allows for a better assortment of tiers. However, it 
introduces some dissipation and it cannot be used with 
centered fluxes and requires more arithmetic operations. 


For RK methods we can also find several alternatives: 


e The CP-LTS technique previously discussed for LF2 
can also be applied to RK schemes, as discussed in 
[20]. Although the dispersive properties do not seem 
to be significantly affected when low spatial orders are 
used, this technique introduces a significant amount of 
numerical dissipation. 

e A similar concept is shown in [126]. First, the whole 
domain is evolved using the higher-tier timestep. Then, 
the values of the solution that have been polluted by 
the usage of a timestep larger than allowed are cast 
away. To compute the values in the lower-tier region an 
interpolation in the boundary is performed. 

In [24], [127] a scheme allowing each element to be 

advanced with its own individual and optimal timestep 

is shown. This technique, called Arbitrary High-Order 

Derivatives (ADER) consists of expanding the solution in 

Taylor series in time. These time derivatives are then re- 

placed by space derivatives using a Cauchy-Kovalevskaya 

procedure. The resulting scheme is high-order accurate in 
space and time. 


2) Implicit-Explicit (IMEX) schemes: Another technique to 
improve the global efficiency of the scheme is to use an 
implicit time integrator in the regions presenting a higher 
stiffness while using a usual explicit time integrator in the 
remaining domain. This approach aims to benefit from the un- 
conditional stability that is usually a characteristic of implicit 
schemes. However, as opposed to LTS, these techniques cannot 
be used recursively, and a large number of unknowns of the 
implicit part can reduce the computational benefits for meshes 
with highly disparate sizes. In [119], implicit and explicit RK 
schemes are applied to several types of PDEs. In [62], [118], 
the authors show an IMEX technique applied to Maxwell’s 
equations using a second order Crank-Nicolson (CN2) scheme 
for the implicit part and a LF2 scheme for the explicit one. 

3) Predictor-Corrector time integration: A predictor- 
corrector scheme is an algorithm that proceeds in two steps. 
First, the prediction step calculates a rough approximation. 
Second, the corrector step refines the initial approximation 
using another means. In [128] an application of a predictor- 
corrector scheme is proposed and in [129] is applied to a 
DGTD method to solve Maxwell’s equations. This lets the 
authors significantly increase the timesteps compared with 


other methods at the expense of a moderate increase in 
memory. 

4) Tailored LSERK schemes: In [111], [112] the authors 
explore the usage of higher number of stages and different 
orders for new LSERK schemes. The approach they take is 
to make assumptions over the form of the spectrum based 
on several typical cases, and then find coefficients for the RK 
schemes by fitting their stability regions to that spectrum. They 
conclude that the increase in the size of timesteps offsets the 
inclusion of new stages, and therefore they are able to obtain 
improvements of up to a 40 — 50%. 

5) Strong Stability Preserving RK (SSP-RK) schemes: In 
[71], [113] a Strong Stability Preserving Runge-Kutta (SSP- 
RK) technique is used. This scheme has a larger stability 
region, thus allowing us to use a larger h;. At each time step, 
the method needs to evaluate m stages achieving an m order of 
convergence. In [71], the authors demonstrate an improvement 
in the number of operations needed by the scheme. The main 
drawback of this method is that it needs to store m stages, 
thus significantly increasing the memory consumption. 

6) Space-time Discontinuous Galerkin (STDG): As we 
have seen in previous sections, the typical approaches consist 
on obtaining a DG spatial semi-discretization that is then 
evolved with an explicit time integrator algorithm. On the 
contrary, the STDG approach consists of applying the DG also 
in the time dimension [21], [114], [115], [117], [130], [131]. 
The resulting scheme, therefore, extends to the time dimension 
most of the properties of spatial DG, such as the high- 
order convergence. There are many ways of implementing this 
concept; some approaches arise to non-dissipative [114], [115] 
schemes, others result in pseudo-explicit methods [21], [117], 
[130], or allow a significant freedom in the election of the 
time step [21]. 


VI. ELECTROMAGNETIC SOURCES 


Electromagnetic sources in DGTD are almost a direct exten- 
sion of the techniques already developed for FDTD [132]. For 
instance, those based on Huygens’s principle [7], [133] employ 
a division of the computational domain into two zones, the 
Total Field Zone (TFZ) and the Scattered Field Zone (SFZ) 
to permit the simulation of field incidence, being the main 
differences between the FDTD and DGTD implementations 
due to the staggered nature of FDTD. 


A. Plane wave 


Let us consider the case of a known incident wave propa- 
gating in a specific region as the one shown in Fig. 7. The 
implementation of Huygen’s principle on a surface where 
the field values are E'°(t) and H'**(t) yields an equivalent 
problem for which the fields are null outside that surface (SFZ) 
and propagates inside it (TFZ). This discontinuity is naturally 
implemented in DGTD [7], [27] through the jumps (21) used 
to calculate the flux across the face of an element k, which 
need to be modified, just in the TF region, according to 


ne x Ert (t) = ny x (B+) ae grea) 
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Fig. 7: Scattered field (orange) and total field regions (blue). 
The elements that need to have altered fluxes are marked in 
darker colors. 


If k is in the SF region the fields calculated at the face 
interfacing with the TFZ are modified to 





f x Ee) = fp xX (E+) — meg] 
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Note that the fields Æ!" and #'"°, can describe any kind of 
waves such as plane waves, linearly or elliptically polarized, or 
even spherical waves with minor modifications. In FDTD, the 
TFZ and SFZ are separated by one cell introducing some nu- 
merical errors. On the contrary, for DGTD, the discontinuous 
nature of the method allows us to use the same geometrical 
surface as the TFZ/SFZ interface. Moreover, the SFZ can be 
pushed directly onto the computational domain and be backed 
with an SMA-BC. On this surface we can also apply a near-to- 
far-field technique [134] which can also be applied to compute 
RCS or radiation patterns of antennas. 


B. Local sources and radiation patterns 


The most obvious way to model a point current source 
J.(t, Fo) is by directly modifying the magnetic field corre- 
sponding to the node in the position 7 shared by it [27], 
[135]. Therefore using 


H(t) = i x Ja(Fo, t) (52) 


in (50) and (51). However, as pointed out in [27], [135] the 
Lagrange polynomials are not able to resolve the field values 
well in the vicinity of 7, where the fields are theoretically 
infinite, forcing the refining of the mesh around that point, 
and thus increasing drastically the computational cost. This 
justifies the introduction of an alternative solution, through 
the use of localized sources, described below. If we use a 
formalism similar to the defined for the TFZ/SFZ illumination, 
we can avoid defining the fields in the region closest to the 
point source [7], [135]. Therefore, for a dipole we could use 
the analytical expressions describing the fields E(t) and 
Ä inc(t) at the position of the interface. These expressions are 
evaluated using theoretical equations such as the ones that can 
be found in [136] for an electric dipole. The advantage of this 
technique is that the field at the interface can be defined freely 
so it is possible to use it to define antenna radiation patterns 
including near fields. 


C. Waveports 


1) TEM port: A TEM mode (e.g. for a coaxial port) can 
be directly injected into the port in a weak manner through 
the flux terms by adding Ehin and >in to the jump terms 
in (21). For the first coaxial TEM mode, these terms are 


i oi 
log(b/a) p 
ee 
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Erie) = pine (t) 
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a é (53) 


in (50) and (51), with a and b being the inner and outer radii 
of the conductors forming the coaxial and Vi"°(t) the time 
variation of the excitation signal. The TEM ports are accu- 
rately truncated with a SMA boundary condition (described 
in section IV-D3) that can be located in the same surface of 
the port. 

2) Waveguide modes: 

a) Arbitrary shape wave guides: In [50], [66], Lou et al. 
describe a method to excite arbitrarily shaped waveguides in 
a FEMTD scheme that can be simply extrapolated to DGTD. 
To do this, Lou starts by solving the 2D Helmholtz problem 
at the plane forming the waveport to get the eigenvectors and 
eigenfrequencies. Then these FD solutions are solved in TD 
by applying the inverse Laplace transform. This approach is 
also useful in truncating the waveguide in a very efficient and 
accurate manner. The absorption in waveguides is particularly 
problematic when absorbing boundary conditions (ABC) are 
used, because the waves are always imping over the absorbing 
conditions with a high angle when their frequency is close to 
the cut frequencies of the supported modes. 

b) Rectangular waveguide: A simplified version of [50], 
[66] can be implemented directly by using the analytical 
TE and TM modes. For a rectangular waveguide, we have 
the following analytical expressions for the supported modes 
[136], 
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where Ymn(w )= SEE + km Rra = (m2)°+(42)’, and 


n = J p/ée. Bmn(w) and Ayn (w) are the spectral components 
of the mode mn. In he Laplace domain, yo = s/c and Ymn = 


\/s?/c? + k2,,,, which in the time domain can be represented 


with the following operators [66]: 
10 


c Ot 
10 
mn = Hin = 75, 


where * stands for convolution in time and the impulse 
response of the system h,,,,(t) is given by [137] 


k 
| Ti Rmnct)u(t) 


w=L= (58) 


+ hmn(t)* (59) 


hmn (t) (60) 


u(t) denoting the unit step function and Jı(-) the Bessel 
function of the first kind. If we are using a Gaussian excitation 
for the mode Amn = f(t) of the form 


f(t) = ult) exp |- (<2) ] 


we can write eq. (58) and (59), using a numerical convolution 
technique, as 


(61) 








ny _ 1 [ Of@) 
yo f(t") = E ej (62) 
Ym f(t) > a 5 (= = + NY tn (tE) 
(63) 


that enables the computation of (54), (55), (56) and (57), in 
the time domain. 


VII. ADVANCED MATERIAL MODELING 
A. Conformal Perfectly Matched Layer 


Perfectly Matched Layers (PMLs) were introduced for the 
first time in [138] as a way to truncate the computational 
domain in open-region scattering problems. They can actu- 
ally be seen as a special kind of non-Maxwellian dispersive 
anisotropic material [33]. The main advantage of PMLs over 
other ABC is that they are largely independent of frequency, 
wave polarization, and angle of incidence. They also have ex- 
tremely small reflection errors. PMLs are material-independent 
and can truncate domains with inhomogeneous, dispersive, 
and non-linear materials. There are several variants of PMLs 
[139]-[144], mainly developed in the FDTD context, with 
features particularly well-suited for different applications: e.g., 
in [139] the author presents an Auxiliary Differential Equation 
(ADE) form of a multi-pole Complex-Frequency Shifted PML 
that presents advantages when it is extended to high-order 
methods. Equivalent convolutional formulation can also be 
used [145]. 

The Conformal PML (CPMLs) allow us in DGTD to add 
PMLs extruding the outer surface of the computational domain 
[12], [144]. The only geometrical restriction to this formula- 
tion is that the PMLs must form a convex-closed region when 
viewed from the outside, or they will be dynamically unstable 
[146]. 

Let us consider the setup of Fig. 8 representing a right- 
handed reference frame called a Darboux frame at a point 
P of an internal surface S. This frame is defined by an 
orthonormal local vector-basis u1, U2 and u3. uy and ug 





S 


Fig. 8: Darboux frame. 


which are tangent to S at a point P along the principal lines 
of curvature. The third component is obtained from the other 
two as u3 = u; X Ug. We can write u; in terms of local 
coordinates €; as u; = (07/0€;)/|Or/0€;|,i = 1, 2,3 where 7 
is the position vector. With these definitions €; = 0 represents 
the surface S. The unit vectors are functions of é and & only. 
Within the defined local reference frame, the radii of curvature 
ro1(&1, €2) and ro2(&1, €2) are positive (for convex S) and we 
also have that at a point P’ the radius r; can be expressed as 
73(€) = roj(€1,€2) + &. 

We will also use the the Lamé coefficients, h; which in the 
Darboux frame are 


hı = r1/To1 (64) 
h2 = T2/ro2 (65) 
h3=1 (66) 


The conformal PML? can be obtained through a complex 
stretching on the normal coordinate €3: 


_ &3 
Ate | s(€)dé 


= [* (ae + 


= (3) + aes) 
jw 

where a > 1 and o > 0. The effect of this stretching on a 
propagating wave can be seen by locally expanding the wave 
in terms of a generalized Wilcox expansion [146], [148], [149] 
showing that there is an induced exponential decay along the 
normal coordinate for 0 > 0. Also, if a > 1, additional 
attenuation can be achieved for evanescent waves, if any. 

The derivation of the CPML consists of substituting (67) 
in the system of Maxwell ’s equations expressed in curvilinear 
coordinates [150]. This leads to a system that is substantially 
different to the classic Maxwell’s equations. For this reason, 
rather than to solve the system of Maxwell’s equations in 


(67) 





2A Complex Frequency Shifted (CFS) formulation can be straightforwardly 
found [139], [147]. This presents some benefits for the attenuation of low 
frequency waves. 


curvilinear coordinates, we recover the original system intro- 
ducing an anisotropic medium. This leads to the formulation 
of the anisotropic and conformal PMLs. 

1) Anisotropic Conformal PML: Let us start by introducing 
a new set of fields E; and H; defined as 


y h A h 2 
Éi =E, Ē= FE, FEs=sE3 (68) 
hı ho 
i ss Ds : 
Hı = —H;, H=- Hə, Hz = sHs (69) 
hy ho 
with 
iy = t (70) 
rol 
ho = a (71) 
E T02 
hasi (12) 


and Ti = foi + &3. 

By introducing these new fields into the system of equations 
described previously, we recover a Maxwellian system of 
equations in curvilinear coordinates, but now for an anisotropic 
medium whose constitutive parameters are given by fi = pA 
and = = cA, with 


= hıh hyh hih 
A= uu a a + U2U2 2 k 2 + U3U3 ie 
hiho hiho shiha 
(73) 
So we can write Maxwell’s equations in PML media as 











Vx E=-jijwH 
V x Ë = ŽjwĒ 


(74) 
(75) 


This means that we can achieve a reflection-less absorption 
of electromagnetic waves incident on a smooth, concave 
surface with an anisotropic constitutive tensor over the volume 
spanning S between S’. 

2) Cartesian PML: The Cartesian PML is a particular case 
of the system described in VII-A1. In this case we have that 
To1 = To2 = œ, so that hy = = hy = = 1. Let us assume that 
the normal to our surface S is oriented towards the +z axis, 
and therefore, uj = Uz, Ug = Uy, and ug = uy. Let us 
consider that s = s,(z) = 1+ 0,(z)/(jw) for attenuation in 
the z direction. The o,(z) profile is taken to minimize the 
reflections [12], [138] that for the parabolic case takes the 
form 


ote) om (E) 


Az being the thickness of the PML. Uniaxial and biaxial 
PMLs can be considered a special case of the triaxial PML. 
By writing them explicitly we see that we need only certain 
DOFs depending on the direction, a fact that can be used with 
nodal basis to reduce the storage needs as mentioned in Section 
IV-F2. 

a) Uniaxial Cartesian PML: For uniaxial PMLs, (73) 
reduces to 


(76) 


5 1 
A, = UgUgS, + UyUys, + uu, (77) 


z 


The x component of Faraday’s law in PML media (74) is 


—[V x Ëj, = us, juH, = jwpH, + csu, (78) 
or, in the time domain 
pðË, = -|V x Ël, — ozu Ë, (19) 
and similarly for the y component. The z component is, 
E iwH, 
Avene ky 
Sz 
jwH, + oH, = oH. 
1+ ja (80) 
ry z j Ä, 
OH ie es 
JW t+ Oz 
= ujwH, E uM, = yo, H, 
with 
juM, =-—o0,M,+ oH, (81) 
which, in the time domain 
worl, = —|V x Ë), — uM, + uo.H, 
a,M, = —0,M, + 02H, (82) 


Similar equations can be obtained for Ampere’s law (75) for 
PML media. Eq. (82) shows the need of introducing a new 
equation, called Auxiliary Differential Equation (ADE) that 
governs the behavior of a polarization current, M. This is 
a common feature between PMLs and Dispersive materials, 
described in Section VII-B. 

b) Biaxial Cartesian PML: When stretching in x and y 
directions, the tensor in (73) can be expressed as, 


Any = Ac(a)Ay (v) 


s 

y 

= gee + uy ty + SzSyUzUz 
x Sy 


(83) 


The x component of Faraday’s law, after following a similar 
procedure as in the previous section, is 


uwO,H, = —[V x Ele — wloy — or) He — 
3M, = —0z(0y — ox) Hy —~o,M, 


uM, (84 
(85) 


the y component can be obtained by switching x and y 
components in the previous expression. The z component is 


aÑ, = —[V x Ë), E Ulz + Oy) Hy — uM, 
ðM, = O,0,M, (86) 
And similarly expressions can be obtained for Ampere’s law 


(75). 
c) Triaxial Cartesian PML: The tensor (73) stretched on 
all directions arise the most general form 





Raye = An ()Ay(y)Az(2) (87) 
The x component of TEA law can be expressed as, 
-[V x Ele = 
= an + Moz + Oy — o1) Hy 
(oz —Ox)(Oy — Ox) 3 (88) 
+u e H, 
JW + Or 


= jw, + woz + Oy — Oz) Hy + pM, 


with 


jwM, =-—o,M,+ (Tz — Or)(0y — oz) Hy (89) 


which in the time domain 
ði, =-|V x Ele — ploz + oy — o2)Hy = uM, 


0:M, = —o7My + (oz — Ox) (dy — ox) Hy (90) 


And similarly for the y and z components and Ampere’s law 
(75). 

3) Constant/Varying conductivities: The equations (82), 
(86), and (90) together with the conductivity profile (76) 
require us to define and store additional mass matrices that 
are modified by the conductivities involved. This is a substan- 
tial amount of additional memory that will also impact the 
performance of the simulation. For certain cases, the benefits 
of using the conductivity profile (76) are clear as was found 
n [12], [138]. If, rather than using a varying conductivity 
we choose a constant profile, there is no need to compute 
additional mass matrices and equations (82), (86), and (90) are 
significantly simplified. This, however, results in an increase 
in the energy reflected which may be a factor, depending on 
the application. 

Note also that the PMLs may have some stability problems 
if we are not careful when choosing large values for con- 
ductivity [123], particularly in time-integration schemes with 
closed stability regions (shown in Table IV). A discussion on 
this issue will be carried out in section VII-B2. 


B. Dispersive materials 


The simulation of dispersive media requires the introduction 
of new DOFs. This makes DGTD particularly well-suited for 
the simulation of these media because, as discussed in Section 
IV, its higher convergence properties let us attain a better ac- 
curacy per DOF than other techniques. There are many models 
available to model dispersive media. The three most common 
are the Debye’s [67], [151], [152], Drude’s [107], [153]- 
[155] and Lorentz’s [156]-[158] models. These models can 
present multiple poles that arise from theoretical arguments of 
material electromagnetic properties. In this Section, we show 
how to adapt the complex-conjugate pole-residue pairs model 
(CCPR) proposed and demonstrated in [159], [160] for the 
FDTD technique. An interesting feature of the CCPR model 
is that it encompasses the other three models as they can be 
expressed as particular cases of it. Another important feature 
of the CCPR model is that we can use already-known and 
freely-available tools to obtain optimal poles and residue pairs 
for a given set of permittivities or permeabilities [161 ]-[163]. 

1) General Formulation: Let us consider the source-free 
Maxwell’s equations (1) and (2) under the assumption that 
only homogeneous and isotropic media are present and there- 
fore, electromagnetic parameters can be assumed to be local 
and spatially constant. When equations (1) and (2) are stated 
for dispersive media in the FD, the permittivity is a frequency- 
dependent magnitude. Following the approach of [159] we can 
model e(w) as 


R 
elw) = E00 + €0 >, [Xr(w) + x;-(w)] 


r=1 


(91) 


with 

c Cc 

"and yi(w) = —* 
JW — Gy. 








Xr(w) = = (92) 

jw — ar 
where £% € R is the permittivity at an infinite frequency and 
Cr, ar E€ C are parameters chosen such that (91) fits the actual 
permittivity data of the material to be modeled. This fit can be 
done using the vector-fitting (VF) routines proposed in [161]- 
[163]. The number of residues and poles pairs, R, necessary 
to obtain a good approximation depends on the complexity 
of the actual e(w). A necessary condition to ensure that € is 
stable and causal is that the real part of a, is negative. 

Introducing model (91) into equation (2), 


R 
cosx E= x Ë- oË- Y (aP. +a P) 03) 
r=1 
with 


P.=e0yrE and P= ex E (94) 


Considering also that if Ë €R then P = P* we can finally 
rewrite (1) and (93) as a system of R + 2 coupled PDEs. 











R 
OnE = Vx H-—cE-2Y R{a,P. ,Ē 
E Ls a 
O,H =--V XE 
u 
ôP, = ar P, + £06, E Vr=1,...,R (95) 


With this formulation, the most commonly used dispersive- 
media models can be obtained as particular cases: 

e A purely conductive media can be modeled using a single 
residue-pole (R = 1) pair with ag = 0 and cp = a /(2€0). 
This is equivalent to adding a conductivity term cE into 
equation (2). 

e Poles of a Debye’s model can be obtained with cy = 
Aeé,./(27;,) and a, = —1/r,. With Ae, and 7, being the 
parameters characterizing the Debye poles [132]. 
Similarly for Lorentz’s media we have that c, = 
jAerw?/(24/w2 — 82) and ap = —6,—jx/w?2 — 62. With 
AEr, Ôr, and T, being the parameters characterizing the 
Lorentz poles [132]. 

2) Stability of dispersive models: The stability conditions 
for dispersive media have been studied by several authors 
[123], [157], [164] finding that the DG semi-discretized 
scheme is stable for any physically stable model. When we 
introduce dispersion models, we find that the original eigen- 
values that we obtained (28) are affected by the new equations 
in (95). The new eigenvalues or the modification of the existing 
ones may make them move out of the stability regions (Fig. 6) 
forcing us to reduce h; to ensure the stability of the scheme. 
For dispersive media [152], [165] and [157] show that the 
DGTD and CGTD schemes with LF2 time integration schemes 
are stable and that their solutions converge. This happens 
because the leap-frog schemes are only unstable depending 
on the imaginary part of the eigenvalues present in Maxwell’s 
equations (95) as discussed in Section V-A. However, when 
we apply the LSERK4 scheme introduced in section V-B the 
new eigenvalues may lead to unstable schemes if the modified 
eigenfrequencies fall out of the closed stability region. 
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Fig. 9: Bistatic RCS of an isotropic/anisotropic sphere (D = 

1.2å and A = 1.0m). LFDG results are compared to those 

appearing in [167], and computed with Ansoft HFSS. 


C. Anisotropic materials 


The DGTD method can be straightforwardly extended to 
anisotropic materials [11], [12], [96], [166]. Substituting € 
and js in (1) and (2) with electric permittivity and magnetic 
permeability symmetric positive-definite tensors £ and {Z. 

We can express £ and ji, and their inverses, in a local base 
of vectors. Following an operation splitting method similar 
as was done in section IV-B, we can again derive a one- 
dimensional Riemann problem to deduce new conditions for 
the numerical fluxes (20). However, the fact that we are using 
tensors leads us to expressions for two matrices, impedance 
(Z2) and admittance (Y2), which play a role equivalent to the 
scalar impedance (Z) and admittance (Y) magnitudes defined 
for the isotropic case. Finally, Zə and Y> are used as the 
impedance values indicated in table II to account for the 
anisotropic nature of the media. The rest of the scheme is 
also affected by the tensorial nature of € and ji but their effect 
is simply to scale the mass matrices (14) depending on the 
tensor values. 

Results for the RCS of an anisotropic sphere are shown in 
Fig. 9 (from [11]). In this problem, a sphere is illuminated 
with a linearly polarized plane wave. The bistatic RCS is 
computed at a frequency for which the diameter is D = 1.2), 
with À being the wavelength. The results are compared with 
a reference case [167] and with the solution provided by the 
Ansoft HFSS commercial software. The maximum difference 
found is 0.35 dB, therefore resulting in a good agreement. 


VIII. SUBCELL MODELS 


Through a modification of the numerical flux conditions 
we can model a wide variety of phenomenons such as lumped 
elements, multi-port networks, or thin layers. 


A. Lumped elements 


The modeling of passive lumped elements such as resistors, 
capacitors, and inductors and general combinations of them 
has been studied in [168]-[170], a generalization of the 
previous works for multi-port networks is carried out in [65], 


[171]. Lumped elements can also be seen as special cases of 
thin layers, which are introduced in the next section. 


B. Thin layers 


Thin layers of any material, including anisotropic and 
dispersive media, are described in [172]-[174] for the FDTD 
method. Specifically for DGTD, a simple resistive layer was 
introduced in [64] and a rigorous formulation and validation, 
suitable also for curved geometries, is shown in [96] and [175]. 
To model thin layers, we use a Surface Impedance Boundary 
Condition (SIBC) that reproduces its behavior. Note also that 
an SIBC defined over a free surface can also be regarded as 
a two-port network model (see Fig. 10). 





Fig. 10: Two-port representation of the air-embedded panel 
illuminated by a TEM plane wave. 





Fig. 11: Magnetic field-controlled circuit representation of a 
thin layer. 


Let us suppose that an indefinite panel embedded in air 
is illuminated by a normally-incident TEM plane wave. An 
equivalent circuit of this setup is shown in Fig. 11 in which 
Éo, Ey, Ay and Ëa are the components of the electric and 
magnetic fields which are tangential to the external faces (0 
and d) of the slab, respectively. Using a two-port transmission 
line formalism we can deduce that the following relationship 
between the field components 


EY gy (E) 
(ae EA io) oo 
with 
®11(w) = Po2(w) = cosh(yd) (97) 
®1o(w) = nsinh(yd) (98) 
P2 (w) = nt sinh(yd) (99) 


with 7 and + being the intrinsic impedance and the propagation 
constant, respectively. Note that these expressions are derived 
from £ and u which can depend on w, as the ones presented 
in Section VII-B. 

Matrix equation (96) can be transformed, using widely 
known two-port network relationships [176], into the following 
equivalent magnetic field-controlled (MFC) formulation: 


Ey] [Zo —-%]| [Ho 
E 7 Zt —La Ha 
~~ 


Sa 
È 2) ii 

Fig. 11 shows the sketch of the circuit model of (100) 
in which the dependence of the electric field at the one 
side of the slab from the magnetic field at the other one is 
represented by MFC electric-field sources. Note that for non- 
symmetrical multi-layered slabs the coefficients ®;; and ®22 
are not coincident, even if matrix [®(w)] always satisfies the 
reciprocity condition; therefore, the impedances Zp and Zq 
can assume different values. The Z can be decomposed using 
a VF technique such as the one we used in Section VII-B, 
using, for instance, the routines provided by [161]-[163] 


(100) 





J bs B 2 
Ž(w) = Žo +) (101) 
fay JW — ap 

being Zo. and Z; an approximation of the impedance 
matrix of the medium. With this decomposition Æ can be 


expressed as 


x P 
E(w) = Qoow) + 55 @p(w) (102) 
p=1 
with z 
E E 103 
a E k ' ) 
and = S 
Oy = er] = Ž, Ï (104) 
Qad,p 


Making use of the ADE formalism, the infinity frequency term 
is x 

Qo (t) = Zx% H (t) 
The frequency dependent terms arise to P new differential 
equations 


(105) 


OQ = 4,Q, + ZH(t) 


that are solved similarly as we did in Section VII-B for 
dispersive materials. The calculated electric fields are used, 
similarly as was done in (23), (24), and (25), to modify the 
jump terms (21) in the following way, 


[[E" ]]siwc,o = 2(E" — Ef) 


(106) 


[E "]sec,o =0 (107) 
and similarly at the d side of the SIBC, 

[[E"Isiwe.a = 2(E" — BE) 

MA” ]]smec,a =0 (108) 


C. Thin wire geometries 


Many common EMC problems often need the evaluation 
of currents flowing along cables. The typical approach is 
to model cables as thin wires that are split into segments 
located along the edges of the cells in the mesh [64]. For each 
segment, the currents and charges are evaluated following an 
implementation of the Holland formalism [177], [178]. These 
equations are discretized for each of the segments following a 
similar formalism as the one explained in section IV, but this 
time for a one-dimensional problem weakly coupled with our 
original semi-discretization [169]. 

A different approach is followed by [179] in which the 
region containing the thin wires is solved using the Time 
Domain Integral Equation that is then coupled with the DGTD 
algorithm by a modification of the numerical fluxes, similarly 
as was done in section VI-A for plane waves. 


IX. COMPUTATIONAL IMPLEMENTATION 


In this section, we present some final remarks regarding 
the computational implementation of a few of the techniques 
previously described. 


A. Geometrical discretization 


An important aspect of an efficient simulation is the capabil- 
ity to generate complex meshes. This requirement is fulfilled 
by most of the commercial CAD tools. 

The tool SEMBA, that we have developed, uses GiD for 
pre- and post-processing. GiD [101] is a commercial tool 
that allows pre-processing of geometries with CAD-importing 
capabilities. These geometries can be meshed in a variety 
of ways, including structured and semistructured meshes, 
linear or quadratic elements, and several types of elements 
(tetrahedrons, hexahedrons, prisms, ...). The program permits 
a high degree of customization that allows users to develop 
their own problem types. Additionally, the results obtained 
can be easily visualized, and several post-processing tools are 
also offered. 

There are many other applications that can offer solutions 
for obtaining meshes. Among the open source tools, we 
highlight Gmsh [100] and OpenFoam [180]. 


B. Preprocessing 


To increase the efficiency of the computations and imple- 
ment certain capabilities, it is necessary to perform some pre- 
processing tasks. These tasks are usually optional as they 
depend on the capabilities implemented in the solver. 

1) Selection of basis functions: An a priori hp-refinement 
heuristic strategy consists of choosing the size of the mesh, 
and the order of the basis function in each tetrahedron [12]. 
The target is to ensure a given accuracy level, minimizing 
the computational cost. The selection of the mesh size has 
to be made in the mesh-generation process, since there is 
an optimum element size that minimizes the computational 
cost for a required accuracy. In real meshes, the element sizes 
vary throughout the computational domain, and the accuracy 
is finally adjusted with the selection of the order p. This allows 
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(a) Example of distribution of time steps in a real problem 
(described in [10]). The choice of the time step and the time 
steps of different levels have been included in the plot. The 
estimated average time step was 88.5 ps, compared to the 
minimum 10.5 ps [12]. 





(b) Elements evolved with the doubling of the minimum 
timestep (Tier 1) using an LTS technique for LF2 [20]. 


Fig. 12: Assortment of tiers for the use of a LTS technique. 


us to employ a higher-order basis for larger tetrahedrons, and 
lower orders for smaller ones. This approach can also combine 
gradient spaces of reduced order p — 1, with rotational spaces 
of complete order p [82]. 


It bears noting that smaller elements need shorter timesteps, 
but if lower orders are used in these elements, the stability 
condition is also relaxed. The combination and mixing of 
different orders of the basis functions depending on element 
size makes the timestep among all the elements more homo- 
geneous, thereby reducing the number of levels required for 
the LTS algorithm. 


2) LTS Level classification: The local time-stepping strat- 
egy described in Section V-C1 requires a classification of all 
the elements according to their maximum timestep. Fig. 12 
illustrates the distribution of the timesteps for the elements 
in a real problem. As there are usually some costs associated 
with the buffering zones between time tiers [12], [20], [22], 
[126], the minimum timestep can be actually tuned to provide 
a maximum average timestep. 
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Fig. 13: Distribution of the mesh among the different MPI 
processes. 


C. Parallelization 


One key advantage of discontinuous Galerkin methods is 
their simplicity for the parallelization in memory-distributed 
hardware architecture, a feature that arises from its explic- 
itness. This allows us to make use of the Message Passing 
Interface (MPI) standard [105], [107], [181] or the GPU 
(CUDA/OpenCL) [120], [182], [183]. The DGTD method 
exhibits great boosts in performance thanks to its memory 
locality, the regularity of access patterns and the high arith- 
metic intensity [183]. There are several ways to perform the 
partitioning of the mesh carried out during the pre-processing 
stage. The simplest way is to manually define regions that 
are handled by the different processes (Fig. 13). However, 
as pointed out in [120], this may result in a load unbalance 
that drastically reduces the efficiency. To solve this issue, the 
ParMetis library [184] can be used to partition the mesh, 
assigning different weights to the cells depending on the 
number of arithmetic operations they need. ParMetis can also 
be configured to provide the partition with the minimum 
interface, to optimize the interprocess communication. Other 
techniques developed for FDTD can also be used to reduce 
the number of interprocess communications [185]. 
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